{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run this, press \"*Runtime*\" and press \"*Run all*\" on a **free** Tesla T4 Google Colab instance!\n",
    "<div class=\"align-center\">\n",
    "<a href=\"https://unsloth.ai/\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png\" width=\"115\"></a>\n",
    "<a href=\"https://discord.gg/unsloth\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/Discord button.png\" width=\"145\"></a>\n",
    "<a href=\"https://docs.unsloth.ai/\"><img src=\"https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true\" width=\"125\"></a></a> Join Discord if you need help + \u2b50 <i>Star us on <a href=\"https://github.com/unslothai/unsloth\">Github</a> </i> \u2b50\n",
    "</div>\n",
    "\n",
    "To install Unsloth on your own computer, follow the installation instructions on our Github page [here](https://docs.unsloth.ai/get-started/installing-+-updating).\n",
    "\n",
    "You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), & [how to save it](#Save)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a34c565",
   "metadata": {},
   "source": [
    "### News"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95f35cc3",
   "metadata": {},
   "source": [
    "Unsloth now supports Text-to-Speech (TTS) models. Read our [guide here](https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning).\n",
    "\n",
    "Read our **[Gemma 3N Guide](https://docs.unsloth.ai/basics/gemma-3n-how-to-run-and-fine-tune)** and check out our new **[Dynamic 2.0](https://docs.unsloth.ai/basics/unsloth-dynamic-2.0-ggufs)** quants which outperforms other quantization methods!\n",
    "\n",
    "Visit our docs for all our [model uploads](https://docs.unsloth.ai/get-started/all-our-models) and [notebooks](https://docs.unsloth.ai/get-started/unsloth-notebooks).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cbb081dc",
   "metadata": {},
   "source": [
    "### Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c14919f",
   "metadata": {},
   "outputs": [],
   "source": "%%capture\nimport os\nif \"COLAB_\" not in \"\".join(os.environ.keys()):\n    !pip install unsloth\nelse:\n    # Do this only in Colab notebooks! Otherwise use pip install unsloth\n    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl triton cut_cross_entropy unsloth_zoo\n    !pip install sentencepiece protobuf \"datasets>=3.4.1,<4.0.0\" huggingface_hub hf_transfer\n    !pip install --no-deps unsloth"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "df0a3fbb",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -qU llama-index llama-index-packs-raft-dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "895f184c",
   "metadata": {},
   "source": [
    "### Unsloth"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47bdb015",
   "metadata": {},
   "source": [
    "### Retrieval Augmented Finetuning (RAFT) Cookbook Recipe!\n",
    "This cookbook aims to show how to use Unsloth to use retrieval augmented finetuning (RAFT). Supervised finetuning is like a closed-book examination where we encode knowledge from the training dataset into the LLM during finetuning, and then test it on unseen examples in the \"exam\".\n",
    "\n",
    "RAFT differs from this in that it is an open-book exam format of finetuning! We allow the LLM to see not just the question and answer (in chain-of-thought format), but also the contexts. The hope is that the LLM will be able to acquire the domain knowledge, but also an improved ability to synthesize answers from context.\n",
    "\n",
    "> Reference: [RAFT: Adapting Language Model to Domain Specific RAG](https://arxiv.org/abs/2403.10131)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb395d06",
   "metadata": {},
   "source": [
    "### Code Setup "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "09217750",
   "metadata": {},
   "source": [
    "First, let's setting up the OPENAI API KEY so that we can use the OpenAI LLMs. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "237e4938",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9bf39a16",
   "metadata": {},
   "source": [
    "Next, we'll set up LlamaIndex. This involves configuring the language model (LLM) and embedding model that LlamaIndex will use. We'll be using OpenAI's `gpt-4o` as our LLM and `text-embedding-ada-002` as our embedding model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ffbaeee6",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import (\n",
    "    Settings,\n",
    "    SimpleDirectoryReader,\n",
    ")\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "\n",
    "Settings.llm = OpenAI(model=\"gpt-4o\")\n",
    "Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-ada-002\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d23e3f89",
   "metadata": {},
   "source": [
    "### Ingest documents "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b087a668",
   "metadata": {},
   "source": [
    "We'll use the following code to download a research paper and then load it using `SimpleDirectoryReader`. This will be the data we use for our retrieval augmented finetuning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f3248b2f",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading files:   0%|          | 0/1 [00:00<?, ?file/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading files: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 1/1 [00:00<00:00,  1.19file/s]\n"
     ]
    }
   ],
   "source": [
    "!mkdir  -p ../data\n",
    "!wget \"https://arxiv.org/pdf/2405.00247.pdf\" -O \"../data/non_traditional_credentials.pdf\"\n",
    "\n",
    "docs = SimpleDirectoryReader(\"../data/\").load_data(show_progress=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0bdfe6e6",
   "metadata": {},
   "source": [
    "### Retrieval Augmented Finetuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d4ef8f1",
   "metadata": {},
   "source": [
    "### Getting the RAFT dataset\n",
    "LlamaIndex has very kindly adapted the source code of the RAFT repository and made it even easier to generate your own RAFT dataset. Just point it to your filepath.t\n",
    "> Reference: [RAFTDatasetPack](https://github.com/run-llama/llama_index/blob/main/llama-index-packs/llama-index-packs-raft-dataset/examples/raft_dataset.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "6f541fca",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.packs.raft_dataset import RAFTDatasetPack\n",
    "\n",
    "raft_dataset = RAFTDatasetPack(\n",
    "    file_path = \"../data/non_traditional_credentials.pdf\",\n",
    "    llm = Settings.llm,\n",
    "    embed_model=Settings.embed_model\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c5b4f17",
   "metadata": {},
   "source": [
    "This cell takes quite long to run! Go have a coffee \u2615\n",
    "> It took 19 minutes for the cell to finish running"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "e192e2b7",
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = raft_dataset.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4694fc1",
   "metadata": {},
   "source": [
    "Let's have a look!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "5a1f4bd7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>type</th>\n",
       "      <th>question</th>\n",
       "      <th>context</th>\n",
       "      <th>oracle_context</th>\n",
       "      <th>cot_answer</th>\n",
       "      <th>instruction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>seed_task_0</td>\n",
       "      <td>general</td>\n",
       "      <td>What percentage increase in credential sharing...</td>\n",
       "      <td>{'sentences': [['The value of non-traditional ...</td>\n",
       "      <td>The value of non-traditional credentials in th...</td>\n",
       "      <td>assistant: To determine the percentage increas...</td>\n",
       "      <td>&lt;DOCUMENT&gt;The value of non-traditional credent...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>seed_task_1</td>\n",
       "      <td>general</td>\n",
       "      <td>How much more likely were learners in the trea...</td>\n",
       "      <td>{'sentences': [['The control group did not rec...</td>\n",
       "      <td>The value of non-traditional credentials in th...</td>\n",
       "      <td>assistant: To answer the question \"How much mo...</td>\n",
       "      <td>&lt;DOCUMENT&gt;The control group did not receive th...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>seed_task_2</td>\n",
       "      <td>general</td>\n",
       "      <td>What was the increase in jobs related to the c...</td>\n",
       "      <td>{'sentences': [['The value of non-traditional ...</td>\n",
       "      <td>The value of non-traditional credentials in th...</td>\n",
       "      <td>assistant: To determine the increase in jobs r...</td>\n",
       "      <td>&lt;DOCUMENT&gt;The value of non-traditional credent...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>seed_task_3</td>\n",
       "      <td>general</td>\n",
       "      <td>Which group of LinkedIn users showed a more pr...</td>\n",
       "      <td>{'sentences': [['The value of non-traditional ...</td>\n",
       "      <td>The value of non-traditional credentials in th...</td>\n",
       "      <td>assistant: To determine which group of LinkedI...</td>\n",
       "      <td>&lt;DOCUMENT&gt;The value of non-traditional credent...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>seed_task_4</td>\n",
       "      <td>general</td>\n",
       "      <td>What platform were the courses completed on fo...</td>\n",
       "      <td>{'sentences': [['Analogously, Past\n",
       "Managerial ...</td>\n",
       "      <td>The value of non-traditional credentials in th...</td>\n",
       "      <td>assistant: To answer the question \"What platfo...</td>\n",
       "      <td>&lt;DOCUMENT&gt;Analogously, Past\\nManagerial Job fo...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            id     type                                           question  \\\n",
       "0  seed_task_0  general  What percentage increase in credential sharing...   \n",
       "1  seed_task_1  general  How much more likely were learners in the trea...   \n",
       "2  seed_task_2  general  What was the increase in jobs related to the c...   \n",
       "3  seed_task_3  general  Which group of LinkedIn users showed a more pr...   \n",
       "4  seed_task_4  general  What platform were the courses completed on fo...   \n",
       "\n",
       "                                             context  \\\n",
       "0  {'sentences': [['The value of non-traditional ...   \n",
       "1  {'sentences': [['The control group did not rec...   \n",
       "2  {'sentences': [['The value of non-traditional ...   \n",
       "3  {'sentences': [['The value of non-traditional ...   \n",
       "4  {'sentences': [['Analogously, Past\n",
       "Managerial ...   \n",
       "\n",
       "                                      oracle_context  \\\n",
       "0  The value of non-traditional credentials in th...   \n",
       "1  The value of non-traditional credentials in th...   \n",
       "2  The value of non-traditional credentials in th...   \n",
       "3  The value of non-traditional credentials in th...   \n",
       "4  The value of non-traditional credentials in th...   \n",
       "\n",
       "                                          cot_answer  \\\n",
       "0  assistant: To determine the percentage increas...   \n",
       "1  assistant: To answer the question \"How much mo...   \n",
       "2  assistant: To determine the increase in jobs r...   \n",
       "3  assistant: To determine which group of LinkedI...   \n",
       "4  assistant: To answer the question \"What platfo...   \n",
       "\n",
       "                                         instruction  \n",
       "0  <DOCUMENT>The value of non-traditional credent...  \n",
       "1  <DOCUMENT>The control group did not receive th...  \n",
       "2  <DOCUMENT>The value of non-traditional credent...  \n",
       "3  <DOCUMENT>The value of non-traditional credent...  \n",
       "4  <DOCUMENT>Analogously, Past\\nManagerial Job fo...  "
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df = pd.DataFrame(dataset)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "9b85ea48",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<DOCUMENT>The value of non-traditional credentials in the labor market*\n",
       "Susan Athey & Emil Palikot\n",
       "May 2, 2024\n",
       "Abstract\n",
       "This study investigates the labor market value of credentials obtained from Massive Open On-\n",
       "line Courses (MOOCs) and shared on business networking platforms. We conducted a random-\n",
       "ized experiment involving more than 800,000 learners, primarily from developing countries and\n",
       "without college degrees, who completed technology or business-related courses on the Coursera\n",
       "platform between September 2022 and March 2023. The intervention targeted learners who had\n",
       "recently completed their courses, encouraging them to share their credentials and simplifying the\n",
       "sharing process. One year after the intervention, we collected data from LinkedIn profiles of ap-\n",
       "proximately 40,000 experimental subjects. We find that the intervention leads to an increase of 17\n",
       "percentage points for credential sharing. Further, learners in the treatment group were 6% more\n",
       "likely to report new employment within a year, with an 8% increase in jobs related to their certifi-\n",
       "cates. This effect was more pronounced among LinkedIn users with lower baseline employability.\n",
       "Across the entire sample, the treated group received a higher number of certificate views, indicat-\n",
       "ing an increased interest in their profiles. These results suggest that facilitating credential sharing\n",
       "and reminding learners of the value of skill signaling can yield significant gains. When the ex-\n",
       "periment is viewed as an encouragement design for credential sharing, we can estimate the local\n",
       "average treatment effect (LATE) of credential sharing (that is, the impact of credential sharing on\n",
       "the workers induced to share by the intervention) for the outcome of getting a job. The LATE esti-\n",
       "mates are imprecise but large in magnitude; they suggest that credential sharing more than doubles\n",
       "the baseline probability of getting a new job in scope for the credential.\n",
       "*We thank Eric Karsten and his team in Coursera for collaborating on this project. </DOCUMENT>\n",
       "<DOCUMENT>13 p.p.) and 36 p.p. (S.E. </DOCUMENT>\n",
       "<DOCUMENT>), which corresponds to a\n",
       "17% increase from baseline. The remaining columns present estimates from the instrumental variable\n",
       "regression with New Job and New Job in Scope as outcomes. In Columns 6, 7, and 8, we restrict attention\n",
       "to jobs reported with a starting date at least four months after treatment. We estimate positive and\n",
       "statistically significant effects. Specifically, we estimate the local average treatment effect of 0.24 (S.E.\n",
       "0.13) for any new job starting at least one month after treatment and 0.36 (S.E. 0.12) when restricting\n",
       "14</DOCUMENT>\n",
       "What percentage increase in credential sharing was observed after the intervention?"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.display import display, Markdown\n",
    "\n",
    "display(Markdown(df.iloc[0]['instruction']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "08098b01",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "The value of non-traditional credentials in the labor market*\n",
       "Susan Athey & Emil Palikot\n",
       "May 2, 2024\n",
       "Abstract\n",
       "This study investigates the labor market value of credentials obtained from Massive Open On-\n",
       "line Courses (MOOCs) and shared on business networking platforms. We conducted a random-\n",
       "ized experiment involving more than 800,000 learners, primarily from developing countries and\n",
       "without college degrees, who completed technology or business-related courses on the Coursera\n",
       "platform between September 2022 and March 2023. The intervention targeted learners who had\n",
       "recently completed their courses, encouraging them to share their credentials and simplifying the\n",
       "sharing process. One year after the intervention, we collected data from LinkedIn profiles of ap-\n",
       "proximately 40,000 experimental subjects. We find that the intervention leads to an increase of 17\n",
       "percentage points for credential sharing. Further, learners in the treatment group were 6% more\n",
       "likely to report new employment within a year, with an 8% increase in jobs related to their certifi-\n",
       "cates. This effect was more pronounced among LinkedIn users with lower baseline employability.\n",
       "Across the entire sample, the treated group received a higher number of certificate views, indicat-\n",
       "ing an increased interest in their profiles. These results suggest that facilitating credential sharing\n",
       "and reminding learners of the value of skill signaling can yield significant gains. When the ex-\n",
       "periment is viewed as an encouragement design for credential sharing, we can estimate the local\n",
       "average treatment effect (LATE) of credential sharing (that is, the impact of credential sharing on\n",
       "the workers induced to share by the intervention) for the outcome of getting a job. The LATE esti-\n",
       "mates are imprecise but large in magnitude; they suggest that credential sharing more than doubles\n",
       "the baseline probability of getting a new job in scope for the credential.\n",
       "*We thank Eric Karsten and his team in Coursera for collaborating on this project. "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(Markdown(df.iloc[0]['oracle_context']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "b02a7419",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "c6bddf7b4fd1411eb08901dbb3ffddda",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Creating json from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "2966201"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Save as .jsonl format\n",
    "dataset.to_json(\"raft_train.jsonl\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "231940bb",
   "metadata": {},
   "source": [
    "### Training the LLM\n",
    "Our dataset is a HuggingFace `Dataset` object, so we can leverage the abstraction's advantage to do a train-test split"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "050ba451",
   "metadata": {},
   "outputs": [],
   "source": [
    "splits = dataset.train_test_split(test_size=0.1)\n",
    "train_ds = splits[\"train\"]\n",
    "eval_ds  = splits[\"test\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "099f990b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Dataset({\n",
       "     features: ['id', 'type', 'question', 'context', 'oracle_context', 'cot_answer', 'instruction'],\n",
       "     num_rows: 301\n",
       " }),\n",
       " Dataset({\n",
       "     features: ['id', 'type', 'question', 'context', 'oracle_context', 'cot_answer', 'instruction'],\n",
       "     num_rows: 34\n",
       " }))"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_ds, eval_ds"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34f73d1e",
   "metadata": {},
   "source": [
    "### Now let's get the model!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d5317f2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\ud83e\udda5 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n",
      "\ud83e\udda5 Unsloth Zoo will now patch everything to make training faster!\n",
      "INFO 05-21 06:09:36 [importing.py:53] Triton module has been replaced with a placeholder.\n",
      "INFO 05-21 06:09:36 [__init__.py:239] Automatically detected platform cuda.\n",
      "==((====))==  Unsloth 2025.4.7: Fast Llama patching. Transformers: 4.51.3. vLLM: 0.8.5.post1.\n",
      "   \\\\   /|    NVIDIA A10G. Num GPUs = 1. Max memory: 22.184 GB. Platform: Linux.\n",
      "O^O/ \\_/ \\    Torch: 2.6.0+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.2.0\n",
      "\\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post2. FA2 = False]\n",
      " \"-____-\"     Free license: http://github.com/unslothai/unsloth\n",
      "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n"
     ]
    }
   ],
   "source": [
    "from unsloth import FastLanguageModel\n",
    "import torch\n",
    "\n",
    "model, tokenizer = FastLanguageModel.from_pretrained(\n",
    "    model_name = \"unsloth/Llama-3.2-1B-Instruct\",\n",
    "    max_seq_length = 2048, # Choose any for long context!\n",
    "    load_in_4bit = True,  # 4 bit quantization to reduce memory\n",
    "    load_in_8bit = False, \n",
    "    full_finetuning = False, \n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "8d23825a",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Unsloth 2025.4.7 patched 16 layers with 16 QKV layers, 16 O layers and 16 MLP layers.\n"
     ]
    }
   ],
   "source": [
    "model = FastLanguageModel.get_peft_model(\n",
    "    model,\n",
    "    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n",
    "    target_modules = [\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n",
    "                      \"gate_proj\", \"up_proj\", \"down_proj\",],\n",
    "    lora_alpha = 16,\n",
    "    lora_dropout = 0, # Supports any, but = 0 is optimized\n",
    "    bias = \"none\",    # Supports any, but = \"none\" is optimized\n",
    "    # [NEW] \"unsloth\" uses 30% less VRAM, fits 2x larger batch sizes!\n",
    "    use_gradient_checkpointing = \"unsloth\", # True or \"unsloth\" for very long context\n",
    "    random_state = 2025,\n",
    "    use_rslora = False,  # We support rank stabilized LoRA\n",
    "    loftq_config = None, # And LoftQ\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e2c3deb6",
   "metadata": {},
   "source": [
    "## Formatting the prompts\n",
    "We need to put everything together into a single 'text' field for the LLM to be trained on. According to the [RAFT paper](https://arxiv.org/abs/2403.10131), we add the context along with the question and chain-of-thought answer in a bid to help our LLM learn how to use the context to answer the question. Let's do that!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "dbee4ea9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4702b474d09245d3b0d1e0f54b879ea5",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Map:   0%|          | 0/301 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "bd4c7ce455d84b8aa9289e8ba99f0fa8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Map:   0%|          | 0/34 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def formatting_prompts_func(examples):\n",
    "    \"\"\"Define a formatter that injects the retrieved context:\"\"\"\n",
    "    \n",
    "    texts = []\n",
    "    for qn, ctx, oracle, instr, ans in zip(\n",
    "        examples['question'],\n",
    "        examples[\"context\"],\n",
    "        examples[\"oracle_context\"],\n",
    "        examples[\"instruction\"],\n",
    "        examples[\"cot_answer\"]\n",
    "    ):\n",
    "        # You can choose to use `oracle_context` (gold) vs. `context` (retrieved)\n",
    "        # Here we show both, but you could just use `context`.\n",
    "        prompt = (\n",
    "            \"### Question:\\n\"\n",
    "            f\"{qn}\\n\\n\"\n",
    "            \"### Context:\\n\"\n",
    "            f\"{ctx}\\n\\n\"\n",
    "            \"### (Oracle Passages):\\n\"\n",
    "            f\"{oracle}\\n\\n\"\n",
    "            \"### Instruction:\\n\"\n",
    "            f\"{instr}\\n\\n\"\n",
    "            \"### Answer:\\n\"\n",
    "        )\n",
    "        # Append the gold answer plus EOS\n",
    "        texts.append(prompt + ans + tokenizer.eos_token)\n",
    "    return {\"text\": texts}\n",
    "\n",
    "# then:\n",
    "train_ds = train_ds.map(formatting_prompts_func, batched=True)\n",
    "eval_ds = eval_ds.map(formatting_prompts_func, batched=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6795f14",
   "metadata": {},
   "source": [
    "Let's take a look at what we just did!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "af2f6851",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "### Question:\n",
       "What is the mean value for the 'Data Science' variable in the LinkedIn matched sample?\n",
       "\n",
       "### Context:\n",
       "{'sentences': [['Table 1: Summary statistics pretreatment and outcome variables\\nCoursera Internal Data LinkedIn Matched Sample\\nVariable name Mean S.E. Mean S.E.\\nTreatment 0.499 0.001 0.500 0.003\\nPanel A: Pre-treatment covariates\\nProfessional Experience Years \u2013 \u2013 3.040 0.028\\nPast Tech Job \u2013 \u2013 0.127 0.002\\nPast Managerial Job \u2013 \u2013 0.064 0.001\\nMain Skill Absolute 0.099 0.001 2.074 0.010\\nMain Skill Standardized 0.000 <0.001 0.000 0.001\\nComputer Science 0.252 0.001 0.230 0.002\\nData Science 0.236 0.001 0.300 0.002\\nInformation Technology 0.140 0.001 0.138 0.002\\nGuided Project 0.168 0.001 0.097 0.002\\nProfessional Certificate 0.005 <0.001 0.005 <0.001\\nSpecialization 0.009 <0.001 0.009 0.001\\nDeveloping Country 0.896 0.001 0.850 0.002\\nAssociate Degree 0.029 <0.001 0.062 0.001\\nBachelor Degree 0.127 0.001 0.367 0.003\\nSome College 0.072 0.001 0.130 0.002\\nDoctorate Degree 0.004 <0.001 0.012 0.001\\nHigh School Diploma 0.059 0.001 0.097 0.002\\nLess than High School 0.009 <0.001 0.012 0.001\\nMasters Degree 0.050 0.001 0.146 0.002\\nNo Education Mentioned 0.645 0.002 0.164 0.002\\nProfessional Degree 0.004 <0.001 0.010 0.001\\nMale 0.302 0.002 0.674 0.002\\nGender Not Mentioned 0.533 0.002 0.101 0.002\\nPanel B: Outcome variables\\nNew Job \u2013 \u2013 0.177 0.002\\nNew Job in Scope \u2013 \u2013 0.133 0.002\\nCredential Shared \u2013 \u2013 0.181 0.002\\nAll Views 0.191 0.001 0.429 0.003\\nAll Views by Others 0.143 0.001 0.318 0.002\\nViews LinkedIn 0.165 0.001 0.409 0.003\\nViews LinkedIn by Others 0.124 0.001 0.296 0.002\\nNote: Professional Experience Years is the number of years between the starting date of the first job and August 2023. Past Tech Job\\ntakes the value of 1 when the learner had a job title related to technology before randomization and zero otherwise. ', 'effects between the bottom and top tertiles, the difference is 0.1 p.p. (S.E. 
', 'For each learner, Coursera assesses skill mastery and assigns a score (Red-\\ndick, 2019). Additionally, we compute a max-mean standardization of the learners\u2019 skill level. We also\\nobserve the country where the learner registered for the course. Following the OECD classification,\\nwe use this information to group countries into developing and developed. Finally, we also observe\\nthe information provided by the learners in their registration survey. ']], 'title': [['placeholder_title', 'placeholder_title', 'placeholder_title']]}\n",
       "\n",
       "### (Oracle Passages):\n",
       "Table 1: Summary statistics pretreatment and outcome variables\n",
       "Coursera Internal Data LinkedIn Matched Sample\n",
       "Variable name Mean S.E. Mean S.E.\n",
       "Treatment 0.499 0.001 0.500 0.003\n",
       "Panel A: Pre-treatment covariates\n",
       "Professional Experience Years \u2013 \u2013 3.040 0.028\n",
       "Past Tech Job \u2013 \u2013 0.127 0.002\n",
       "Past Managerial Job \u2013 \u2013 0.064 0.001\n",
       "Main Skill Absolute 0.099 0.001 2.074 0.010\n",
       "Main Skill Standardized 0.000 <0.001 0.000 0.001\n",
       "Computer Science 0.252 0.001 0.230 0.002\n",
       "Data Science 0.236 0.001 0.300 0.002\n",
       "Information Technology 0.140 0.001 0.138 0.002\n",
       "Guided Project 0.168 0.001 0.097 0.002\n",
       "Professional Certificate 0.005 <0.001 0.005 <0.001\n",
       "Specialization 0.009 <0.001 0.009 0.001\n",
       "Developing Country 0.896 0.001 0.850 0.002\n",
       "Associate Degree 0.029 <0.001 0.062 0.001\n",
       "Bachelor Degree 0.127 0.001 0.367 0.003\n",
       "Some College 0.072 0.001 0.130 0.002\n",
       "Doctorate Degree 0.004 <0.001 0.012 0.001\n",
       "High School Diploma 0.059 0.001 0.097 0.002\n",
       "Less than High School 0.009 <0.001 0.012 0.001\n",
       "Masters Degree 0.050 0.001 0.146 0.002\n",
       "No Education Mentioned 0.645 0.002 0.164 0.002\n",
       "Professional Degree 0.004 <0.001 0.010 0.001\n",
       "Male 0.302 0.002 0.674 0.002\n",
       "Gender Not Mentioned 0.533 0.002 0.101 0.002\n",
       "Panel B: Outcome variables\n",
       "New Job \u2013 \u2013 0.177 0.002\n",
       "New Job in Scope \u2013 \u2013 0.133 0.002\n",
       "Credential Shared \u2013 \u2013 0.181 0.002\n",
       "All Views 0.191 0.001 0.429 0.003\n",
       "All Views by Others 0.143 0.001 0.318 0.002\n",
       "Views LinkedIn 0.165 0.001 0.409 0.003\n",
       "Views LinkedIn by Others 0.124 0.001 0.296 0.002\n",
       "Note: Professional Experience Years is the number of years between the starting date of the first job and August 2023. Past Tech Job\n",
       "takes the value of 1 when the learner had a job title related to technology before randomization and zero otherwise. \n",
       "\n",
       "### Instruction:\n",
       "<DOCUMENT>Table 1: Summary statistics pretreatment and outcome variables\n",
       "Coursera Internal Data LinkedIn Matched Sample\n",
       "Variable name Mean S.E. Mean S.E.\n",
       "Treatment 0.499 0.001 0.500 0.003\n",
       "Panel A: Pre-treatment covariates\n",
       "Professional Experience Years \u2013 \u2013 3.040 0.028\n",
       "Past Tech Job \u2013 \u2013 0.127 0.002\n",
       "Past Managerial Job \u2013 \u2013 0.064 0.001\n",
       "Main Skill Absolute 0.099 0.001 2.074 0.010\n",
       "Main Skill Standardized 0.000 <0.001 0.000 0.001\n",
       "Computer Science 0.252 0.001 0.230 0.002\n",
       "Data Science 0.236 0.001 0.300 0.002\n",
       "Information Technology 0.140 0.001 0.138 0.002\n",
       "Guided Project 0.168 0.001 0.097 0.002\n",
       "Professional Certificate 0.005 <0.001 0.005 <0.001\n",
       "Specialization 0.009 <0.001 0.009 0.001\n",
       "Developing Country 0.896 0.001 0.850 0.002\n",
       "Associate Degree 0.029 <0.001 0.062 0.001\n",
       "Bachelor Degree 0.127 0.001 0.367 0.003\n",
       "Some College 0.072 0.001 0.130 0.002\n",
       "Doctorate Degree 0.004 <0.001 0.012 0.001\n",
       "High School Diploma 0.059 0.001 0.097 0.002\n",
       "Less than High School 0.009 <0.001 0.012 0.001\n",
       "Masters Degree 0.050 0.001 0.146 0.002\n",
       "No Education Mentioned 0.645 0.002 0.164 0.002\n",
       "Professional Degree 0.004 <0.001 0.010 0.001\n",
       "Male 0.302 0.002 0.674 0.002\n",
       "Gender Not Mentioned 0.533 0.002 0.101 0.002\n",
       "Panel B: Outcome variables\n",
       "New Job \u2013 \u2013 0.177 0.002\n",
       "New Job in Scope \u2013 \u2013 0.133 0.002\n",
       "Credential Shared \u2013 \u2013 0.181 0.002\n",
       "All Views 0.191 0.001 0.429 0.003\n",
       "All Views by Others 0.143 0.001 0.318 0.002\n",
       "Views LinkedIn 0.165 0.001 0.409 0.003\n",
       "Views LinkedIn by Others 0.124 0.001 0.296 0.002\n",
       "Note: Professional Experience Years is the number of years between the starting date of the first job and August 2023. Past Tech Job\n",
       "takes the value of 1 when the learner had a job title related to technology before randomization and zero otherwise. </DOCUMENT>\n",
       "<DOCUMENT>effects between the bottom and top tertiles, the difference is 0.1 p.p. (S.E. </DOCUMENT>\n",
       "<DOCUMENT>For each learner, Coursera assesses skill mastery and assigns a score (Red-\n",
       "dick, 2019). Additionally, we compute a max-mean standardization of the learners\u2019 skill level. We also\n",
       "observe the country where the learner registered for the course. Following the OECD classification,\n",
       "we use this information to group countries into developing and developed. Finally, we also observe\n",
       "the information provided by the learners in their registration survey. </DOCUMENT>\n",
       "What is the mean value for the 'Data Science' variable in the LinkedIn matched sample?\n",
       "\n",
       "### Answer:\n",
       "assistant: To answer the question about the mean value for the 'Data Science' variable in the LinkedIn matched sample, we need to follow these steps:\n",
       "\n",
       "1. **Identify the Relevant Section**: The context provides a table with summary statistics for various variables. We need to locate the section that pertains to the LinkedIn matched sample.\n",
       "\n",
       "2. **Locate the 'Data Science' Variable**: Within the table, find the row labeled 'Data Science'. This row will contain the mean value for the 'Data Science' variable in both the Coursera Internal Data and the LinkedIn Matched Sample.\n",
       "\n",
       "3. **Extract the Mean Value for LinkedIn Matched Sample**: Once the 'Data Science' row is located, extract the mean value specifically for the LinkedIn matched sample.\n",
       "\n",
       "Let's proceed with these steps:\n",
       "\n",
       "- The context provides a table with two columns of mean values: one for Coursera Internal Data and another for LinkedIn Matched Sample.\n",
       "\n",
       "- ##begin_quote## Data Science 0.236 0.001 0.300 0.002 ##end_quote##: This line from the context shows the mean values for the 'Data Science' variable. The first mean value (0.236) corresponds to the Coursera Internal Data, and the second mean value (0.300) corresponds to the LinkedIn Matched Sample.\n",
       "\n",
       "Therefore, the mean value for the 'Data Science' variable in the LinkedIn matched sample is 0.300.\n",
       "\n",
       "<ANSWER>: 0.300<|eot_id|>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.display import display, Markdown\n",
    "\n",
    "display(Markdown(pd.DataFrame(train_ds).head()['text'].iloc[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e73a772a",
   "metadata": {},
   "source": [
    "### And now we finally get to training!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "e91d643e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "065a8f099e084b70a9b554628248a109",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Unsloth: Tokenizing [\"text\"] (num_proc=4):   0%|          | 0/301 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1fa1652bdf324656b058a3ac21bd2dc9",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Unsloth: Tokenizing [\"text\"] (num_proc=4):   0%|          | 0/34 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from trl import SFTTrainer\n",
    "from transformers import TrainingArguments\n",
    "\n",
    "training_args = TrainingArguments(\n",
    "    output_dir=\"llama32_1bn_raft_v2\", #This will also be used as your huggingfacehub model id name\n",
    "    report_to=\"wandb\", #Leave this to be blank if you don't want to use wandb\n",
    "    run_name=\"RAFT_SFT_Take7\",\n",
    "    eval_steps=5,\n",
    "    eval_strategy=\"steps\",\n",
    "    per_device_train_batch_size=1,    # small batches if quantized\n",
    "    per_device_eval_batch_size=1,\n",
    "    gradient_accumulation_steps=8,\n",
    "    learning_rate=2e-5,\n",
    "    num_train_epochs=5,\n",
    "    # max_steps=60,                    # or set num_train_epochs\n",
    "    save_strategy=\"no\",\n",
    "    gradient_checkpointing=True,\n",
    "    logging_strategy=\"steps\",\n",
    "    logging_steps=5,\n",
    "    seed=42,\n",
    "    optim=\"adamw_torch\",\n",
    "    lr_scheduler_type=\"cosine\",\n",
    ")\n",
    "\n",
    "trainer = SFTTrainer(\n",
    "    model = model,\n",
    "    tokenizer = tokenizer,\n",
    "    train_dataset = train_ds,\n",
    "    eval_dataset = eval_ds, \n",
    "    args=training_args,\n",
    "    dataset_text_field=\"text\",\n",
    "    \n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b05b2b5a",
   "metadata": {},
   "source": [
    "Current memory statistics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "09fdea09",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GPU = NVIDIA A10G. Max memory = 22.184 GB.\n",
      "1.457 GB of memory reserved.\n"
     ]
    }
   ],
   "source": [
    "gpu_stats = torch.cuda.get_device_properties(0)\n",
    "start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
    "max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)\n",
    "print(f\"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.\")\n",
    "print(f\"{start_gpu_memory} GB of memory reserved.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "9adf6997",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1\n",
      "   \\\\   /|    Num examples = 301 | Num Epochs = 5 | Total steps = 185\n",
      "O^O/ \\_/ \\    Batch size per device = 1 | Gradient accumulation steps = 8\n",
      "\\        /    Data Parallel GPUs = 1 | Total batch size (1 x 8 x 1) = 8\n",
      " \"-____-\"     Trainable parameters = 11,272,192/1,000,000,000 (1.13% trained)\n",
      "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mtituslhy\u001b[0m to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "Tracking run with wandb version 0.19.11"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "Run data is saved locally in <code>/home/ubuntu/ideal-palm-tree/notebooks/wandb/run-20250521_061652-hc9ebbef</code>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "Syncing run <strong><a href='https://wandb.ai/tituslhy/huggingface/runs/hc9ebbef' target=\"_blank\">RAFT_SFT_Take7</a></strong> to <a href='https://wandb.ai/tituslhy/huggingface' target=\"_blank\">Weights & Biases</a> (<a href='https://wandb.me/developer-guide' target=\"_blank\">docs</a>)<br>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       " View project at <a href='https://wandb.ai/tituslhy/huggingface' target=\"_blank\">https://wandb.ai/tituslhy/huggingface</a>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       " View run at <a href='https://wandb.ai/tituslhy/huggingface/runs/hc9ebbef' target=\"_blank\">https://wandb.ai/tituslhy/huggingface/runs/hc9ebbef</a>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Unsloth: Will smartly offload gradients to save VRAM!\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      \n",
       "      <progress value='185' max='185' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      [185/185 10:29, Epoch 4/5]\n",
       "    </div>\n",
       "    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       " <tr style=\"text-align: left;\">\n",
       "      <th>Step</th>\n",
       "      <th>Training Loss</th>\n",
       "      <th>Validation Loss</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>1.493000</td>\n",
       "      <td>1.633143</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>10</td>\n",
       "      <td>1.466600</td>\n",
       "      <td>1.617843</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>15</td>\n",
       "      <td>1.546300</td>\n",
       "      <td>1.596143</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>20</td>\n",
       "      <td>1.485900</td>\n",
       "      <td>1.571562</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>25</td>\n",
       "      <td>1.449800</td>\n",
       "      <td>1.546785</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>30</td>\n",
       "      <td>1.426500</td>\n",
       "      <td>1.521693</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>35</td>\n",
       "      <td>1.446800</td>\n",
       "      <td>1.497457</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>40</td>\n",
       "      <td>1.376700</td>\n",
       "      <td>1.474485</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>45</td>\n",
       "      <td>1.334400</td>\n",
       "      <td>1.454567</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>50</td>\n",
       "      <td>1.365500</td>\n",
       "      <td>1.434021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>55</td>\n",
       "      <td>1.316800</td>\n",
       "      <td>1.413398</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>60</td>\n",
       "      <td>1.372000</td>\n",
       "      <td>1.392783</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>65</td>\n",
       "      <td>1.300700</td>\n",
       "      <td>1.373677</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>70</td>\n",
       "      <td>1.275900</td>\n",
       "      <td>1.352113</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>75</td>\n",
       "      <td>1.247600</td>\n",
       "      <td>1.334677</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>80</td>\n",
       "      <td>1.229100</td>\n",
       "      <td>1.317328</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>85</td>\n",
       "      <td>1.204400</td>\n",
       "      <td>1.301929</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>90</td>\n",
       "      <td>1.168800</td>\n",
       "      <td>1.288014</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>95</td>\n",
       "      <td>1.255500</td>\n",
       "      <td>1.275806</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>100</td>\n",
       "      <td>1.246700</td>\n",
       "      <td>1.264712</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>105</td>\n",
       "      <td>1.135600</td>\n",
       "      <td>1.254323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>110</td>\n",
       "      <td>1.122600</td>\n",
       "      <td>1.242589</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>115</td>\n",
       "      <td>1.171900</td>\n",
       "      <td>1.239755</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>120</td>\n",
       "      <td>1.166100</td>\n",
       "      <td>1.233676</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>125</td>\n",
       "      <td>1.179500</td>\n",
       "      <td>1.228176</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>130</td>\n",
       "      <td>1.090000</td>\n",
       "      <td>1.223785</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>135</td>\n",
       "      <td>1.133300</td>\n",
       "      <td>1.220154</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>140</td>\n",
       "      <td>1.124100</td>\n",
       "      <td>1.216492</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>145</td>\n",
       "      <td>1.119700</td>\n",
       "      <td>1.214525</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>150</td>\n",
       "      <td>1.122200</td>\n",
       "      <td>1.210105</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>155</td>\n",
       "      <td>1.116500</td>\n",
       "      <td>1.210061</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>160</td>\n",
       "      <td>1.118800</td>\n",
       "      <td>1.210211</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>165</td>\n",
       "      <td>1.135800</td>\n",
       "      <td>1.208962</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>170</td>\n",
       "      <td>1.110000</td>\n",
       "      <td>1.208903</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>175</td>\n",
       "      <td>1.113200</td>\n",
       "      <td>1.208125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>180</td>\n",
       "      <td>1.147100</td>\n",
       "      <td>1.208712</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>185</td>\n",
       "      <td>1.133800</td>\n",
       "      <td>1.208498</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table><p>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Unsloth: Not an error, but LlamaForCausalLM does not accept `num_items_in_batch`.\n",
      "Using gradient accumulation will be very slightly less accurate.\n",
      "Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient\n"
     ]
    }
   ],
   "source": [
    "trainer_stats = trainer.train()"
   ]
  },
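  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The banner above reports `Total steps = 185` for 301 examples, 5 epochs, and a total batch size of 8. The sketch below reproduces that arithmetic. It assumes only complete gradient accumulation windows count as optimizer steps within an epoch, which matches this run; other `transformers` versions may round up instead. The helper name `estimate_steps` is hypothetical and not part of any library.\n",
    "\n",
    "```python\n",
    "def estimate_steps(num_examples, per_device_batch, grad_accum, num_gpus, epochs):\n",
    "    # Samples consumed by one optimizer update\n",
    "    effective_batch = per_device_batch * grad_accum * num_gpus\n",
    "    # Micro-batches per epoch (ceiling division keeps the last partial batch)\n",
    "    micro_batches = -(-num_examples // (per_device_batch * num_gpus))\n",
    "    # Assumption: only complete accumulation windows count as optimizer steps\n",
    "    steps_per_epoch = max(micro_batches // grad_accum, 1)\n",
    "    return effective_batch, steps_per_epoch * epochs\n",
    "\n",
    "effective_batch, total_steps = estimate_steps(301, 1, 8, 1, 5)\n",
    "print(effective_batch, total_steps)\n",
    "```\n",
    "\n",
    "Under this assumption, 185 updates of 8 samples cover 1,480 samples, slightly fewer than the 1,505 in 5 full passes, which would be consistent with the progress bar ending at `Epoch 4/5`."
   ]
  },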
  {
   "cell_type": "markdown",
   "id": "718da831",
   "metadata": {},
   "source": [
    "Used memory statistics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "98003bb9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "637.9309 seconds used for training.\n",
      "10.63 minutes used for training.\n",
      "Peak reserved memory = 2.156 GB.\n",
      "Peak reserved memory for training = 0.699 GB.\n",
      "Peak reserved memory % of max memory = 9.719 %.\n",
      "Peak reserved memory for training % of max memory = 3.151 %.\n"
     ]
    }
   ],
   "source": [
    "used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
    "used_memory_for_lora = round(used_memory - start_gpu_memory, 3)\n",
    "used_percentage = round(used_memory / max_memory * 100, 3)\n",
    "lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)\n",
    "print(f\"{trainer_stats.metrics['train_runtime']} seconds used for training.\")\n",
    "print(\n",
    "    f\"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.\"\n",
    ")\n",
    "print(f\"Peak reserved memory = {used_memory} GB.\")\n",
    "print(f\"Peak reserved memory for training = {used_memory_for_lora} GB.\")\n",
    "print(f\"Peak reserved memory % of max memory = {used_percentage} %.\")\n",
    "print(f\"Peak reserved memory for training % of max memory = {lora_percentage} %.\")"
   ]
  },
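  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check on the numbers printed above, the following sketch redoes the arithmetic in plain Python with the reported values hard-coded, so it runs without a GPU. The function name `memory_summary` is hypothetical. Note that the cell divides bytes by 1024 three times, so the figures labelled GB are strictly GiB.\n",
    "\n",
    "```python\n",
    "def memory_summary(peak_gb, baseline_gb, total_gb):\n",
    "    # Memory attributable to training = peak minus what was reserved before training\n",
    "    training_gb = round(peak_gb - baseline_gb, 3)\n",
    "    # Express both figures as a percentage of total device memory\n",
    "    peak_pct = round(peak_gb / total_gb * 100, 3)\n",
    "    training_pct = round(training_gb / total_gb * 100, 3)\n",
    "    return training_gb, peak_pct, training_pct\n",
    "\n",
    "# Values reported in this notebook: peak 2.156, baseline 1.457, device total 22.184\n",
    "print(memory_summary(2.156, 1.457, 22.184))\n",
    "```"
   ]
  },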
  {
   "cell_type": "markdown",
   "id": "36e8502e",
   "metadata": {},
   "source": [
    "<a name=\"Save\"></a>\n",
    "### Saving to float16 for VLLM\n",
    "\n",
    "We also support saving to `float16` directly. Select `merged_16bit` for float16 or `merged_4bit` for int4. We also allow `lora` adapters as a fallback. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11ffb85d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Merge to 16bit\n",
    "if False: model.save_pretrained_merged(\"model\", tokenizer, save_method = \"merged_16bit\",)\n",
    "if False: model.push_to_hub_merged(\"hf/model\", tokenizer, save_method = \"merged_16bit\", token = \"\")\n",
    "\n",
    "# Merge to 4bit\n",
    "if False: model.save_pretrained_merged(\"model\", tokenizer, save_method = \"merged_4bit\",)\n",
    "if False: model.push_to_hub_merged(\"hf/model\", tokenizer, save_method = \"merged_4bit\", token = \"\")\n",
    "\n",
    "# Just LoRA adapters\n",
    "if False:\n",
    "    model.save_pretrained(\"model\")\n",
    "    tokenizer.save_pretrained(\"model\")\n",
    "if False:\n",
    "    model.push_to_hub(\"hf/model\", token = \"\")\n",
    "    tokenizer.push_to_hub(\"hf/model\", token = \"\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4a73633",
   "metadata": {},
   "source": [
    "### GGUF / llama.cpp Conversion\n",
    "To save to `GGUF` / `llama.cpp`, we support it natively now! We clone `llama.cpp` and we default save it to `q8_0`. We allow all methods like `q4_k_m`. Use `save_pretrained_gguf` for local saving and `push_to_hub_gguf` for uploading to HF.\n",
    "\n",
    "Some supported quant methods (full list on our [Wiki page](https://github.com/unslothai/unsloth/wiki#gguf-quantization-options)):\n",
    "* `q8_0` - Fast conversion. High resource use, but generally acceptable.\n",
    "* `q4_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.\n",
    "* `q5_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.\n",
    "\n",
    "[**NEW**] To finetune and auto export to Ollama, try our [Ollama notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "db4e918e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save to 8bit Q8_0\n",
    "if False: model.save_pretrained_gguf(\"model\", tokenizer,)\n",
    "# Remember to go to https://huggingface.co/settings/tokens for a token!\n",
    "# And change hf to your username!\n",
    "if False: model.push_to_hub_gguf(\"hf/model\", tokenizer, token = \"\")\n",
    "\n",
    "# Save to 16bit GGUF\n",
    "if False: model.save_pretrained_gguf(\"model\", tokenizer, quantization_method = \"f16\")\n",
    "if False: model.push_to_hub_gguf(\"hf/model\", tokenizer, quantization_method = \"f16\", token = \"\")\n",
    "\n",
    "# Save to q4_k_m GGUF\n",
    "if False: model.save_pretrained_gguf(\"model\", tokenizer, quantization_method = \"q4_k_m\")\n",
    "if False: model.push_to_hub_gguf(\"hf/model\", tokenizer, quantization_method = \"q4_k_m\", token = \"\")\n",
    "\n",
    "# Save to multiple GGUF options - much faster if you want multiple!\n",
    "if False:\n",
    "    model.push_to_hub_gguf(\n",
    "        \"hf/model\", # Change hf to your username!\n",
    "        tokenizer,\n",
    "        quantization_method = [\"q4_k_m\", \"q8_0\", \"q5_k_m\",],\n",
    "        token = \"\",\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, use the `model-unsloth.gguf` file or `model-unsloth-Q4_K_M.gguf` file in llama.cpp or a UI based system like Jan or Open WebUI. You can install Jan [here](https://github.com/janhq/jan) and Open WebUI [here](https://github.com/open-webui/open-webui)\n",
    "\n",
    "And we're done! If you have any questions on Unsloth, we have a [Discord](https://discord.gg/unsloth) channel! If you find any bugs or want to keep updated with the latest LLM stuff, or need help, join projects etc, feel free to join our Discord!\n",
    "\n",
    "Some other links:\n",
    "1. Train your own reasoning model - Llama GRPO notebook [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb)\n",
    "2. Saving finetunes to Ollama. [Free notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)\n",
    "3. Llama 3.2 Vision finetuning - Radiography use case. [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)\n",
    "6. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our [documentation](https://docs.unsloth.ai/get-started/unsloth-notebooks)!\n",
    "\n",
    "<div class=\"align-center\">\n",
    "  <a href=\"https://unsloth.ai\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png\" width=\"115\"></a>\n",
    "  <a href=\"https://discord.gg/unsloth\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/Discord.png\" width=\"145\"></a>\n",
    "  <a href=\"https://docs.unsloth.ai/\"><img src=\"https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true\" width=\"125\"></a>\n",
    "\n",
    "  Join Discord if you need help + \u2b50\ufe0f <i>Star us on <a href=\"https://github.com/unslothai/unsloth\">Github</a> </i> \u2b50\ufe0f\n",
    "</div>\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.8"
  },
  "accelerator": "GPU",
  "colab": {
   "provenance": [],
   "gpuType": "T4",
   "include_colab_link": true
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {}
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}